Robust speech recognition using VAD-measure-embedded decoder

Authors

Tasuku Oonishi

Paul R. Dixon

Koji Iwano

Sadaoki Furui

Abstract

In a speech recognition system, a voice activity detector (VAD) is a crucial component, not only for maintaining accuracy but also for reducing computational cost. Front-end approaches that drop non-speech frames typically detect speech frames using speech/non-speech classification information such as the zero-crossing rate or statistical models. These approaches discard the speech/non-speech classification information after voice detection. This paper proposes an approach that uses the speech/non-speech information to adjust the scores of the recognition hypotheses. Experimental results show that our approach can improve accuracy significantly and, when combined with the front-end method, reduce computational cost.
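The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of keeping the speech/non-speech evidence and folding it into hypothesis scores. Everything in it is an assumption made for illustration: the single-Gaussian models over frame log-energy, the log-likelihood ratio used as the VAD measure, the `weight` parameter, and all function names. It should not be read as the authors' method.

```python
import math

def gaussian_log_pdf(x, mean, var):
    """Log density of a 1-D Gaussian (stand-in for a statistical VAD model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def vad_measure(log_energy, speech=(10.0, 4.0), nonspeech=(2.0, 4.0)):
    """Hypothetical VAD measure: log-likelihood ratio of speech vs. non-speech.

    Each model is a (mean, variance) pair; the values are made up.
    Positive results mean the frame looks more like speech.
    """
    return (gaussian_log_pdf(log_energy, *speech)
            - gaussian_log_pdf(log_energy, *nonspeech))

def adjusted_score(hyp_score, is_speech_hyp, llr, weight=0.5):
    """Add a weighted VAD bonus to a hypothesis log score.

    Hypotheses that claim speech are rewarded when the frame looks like
    speech; hypotheses that claim silence are rewarded when it does not.
    """
    bonus = llr if is_speech_hyp else -llr
    return hyp_score + weight * bonus

if __name__ == "__main__":
    llr = vad_measure(9.0)  # a speech-like frame under the assumed models
    print("speech hypothesis: ", adjusted_score(-100.0, True, llr))
    print("silence hypothesis:", adjusted_score(-100.0, False, llr))
```

In this sketch, a speech-like frame raises the score of hypotheses in speech states and lowers the score of hypotheses in silence states, so the classification evidence keeps influencing the search instead of being thrown away after frame dropping.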


Similar resources

VAD-measure-embedded decoder with online model adaptation

We previously proposed a decoding method for automatic speech recognition utilizing hypothesis scores weighted by voice activity detection (VAD)-measures. This method uses two Gaussian mixture models (GMMs) to obtain confidence measures: one for speech, the other for non-speech. To achieve good search performance, we need to adapt the GMMs properly for input utterances and environmental noise. ...


A Low-Cost Robust Front-end for Embedded ASR System

In this paper we propose a low-cost, robust MFCC feature extraction algorithm that combines noise reduction and voice activity detection (VAD) for automatic speech recognition (ASR) systems in embedded applications. To remedy the effect of additive noise, a magnitude spectrum subtraction method is used. A VAD is performed to distinguish the speech signal from the noise signal. It discriminates speech/non...


A Hybrid Hmm/traps Model for Robust

We present three voice activity detection (VAD) algorithms that are suitable for the off-line processing of noisy speech and compare their performance on SPINE-2 evaluation data using speech recognition error rate as the quality metric. One VAD system is a simple HMM-based segmenter that uses normalized log-energy and a degree of voicing measure as raw features. The other two VAD systems focus ...


A hybrid HMM/traps model for robust voice activity detection


Robust Speech Recognition in a Car Using a Microphone Array

The performance of automatic speech recognition relies on a vast amount of training speech data, mostly recorded with little or no background noise. Performance degrades significantly in the presence of background noise, which increases the type mismatch between training and test environments. Speech enhancement techniques can reduce the amount of type mismatch by extracting reliable speech features fro...




Publication date: 2009
